# FP8 Dynamic Quantization
## Qwen3 30B A3B FP8 Dynamic

An FP8 dynamic quantization of the Qwen/Qwen3-30B-A3B model, optimized for inference efficiency on Ampere-architecture GPUs.

Tags: Large Language Model · Transformers
Author: khajaphysist · 403 · 2
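The "dynamic" in FP8 dynamic quantization means activation scales are not calibrated ahead of time; each tensor's scale is recomputed at inference from its own max magnitude. Below is a minimal pure-Python sketch of that per-tensor scale computation, under a loud simplification: the FP8 E4M3 cast is modeled only as clamping to the e4m3 finite range (±448), whereas real kernels also round the mantissa, so this illustrates the scaling logic rather than the hardware format.

```python
# Simplified sketch of dynamic (per-tensor, at-runtime) FP8 E4M3 scaling.
# Assumption: the FP8 cast is modeled only by clamping to the e4m3 range;
# mantissa rounding done by real hardware/kernels is deliberately omitted.

FP8_E4M3_MAX = 448.0  # largest finite value representable in e4m3


def dynamic_fp8_quantize(x):
    """Return (quantized values, scale) for one tensor.

    The scale is recomputed on every call -- that recomputation is what
    makes the scheme "dynamic", as opposed to static calibration.
    """
    amax = max(abs(v) for v in x)
    scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
    # Divide by the scale so the largest element lands exactly at the
    # e4m3 boundary, then clamp (our stand-in for the FP8 cast).
    return [max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, v / scale)) for v in x], scale


def dequantize(q, scale):
    """Recover approximate original values by multiplying the scale back."""
    return [v * scale for v in q]


activations = [0.5, -2.0, 3.75, 896.0]
q, scale = dynamic_fp8_quantize(activations)   # scale = 896 / 448 = 2.0
x_hat = dequantize(q, scale)
```

Because mantissa rounding is omitted here, the round trip is exact; in a real FP8 cast each value would additionally be snapped to the nearest representable e4m3 number.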
## Llama Joycaption Alpha Two Hf Llava FP8 Dynamic

License: MIT

An FP8-compressed version of the Llama JoyCaption Alpha Two model developed by fancyfeast, produced with the llm-compressor tool and compatible with the vLLM framework.

Tags: Image-to-Text · English
Author: JKCHSTR · 248 · 1
## Magnum V4 72b FP8 Dynamic

License: Apache-2.0

A 72B-parameter large language model fine-tuned from Qwen2.5-72B-Instruct. It uses dynamic FP8 quantization to optimize inference efficiency and aims to reproduce the prose quality of Claude 3.

Tags: Large Language Model · Transformers · English
Author: Infermatic · 2,106 · 2